Exploring Predicate-Argument Relations for Named Entity Recognition in the Molecular Biology Domain

نویسندگان

  • Tuangthong Wattarujeekrit
  • Nigel Collier
چکیده

In this paper, the semantic relationships between a predicate and its arguments in terms of semantic roles are employed to improve lexical-based named entity recognition (NER) in the molecular biology domain. The semantic roles were realized in various sets of syntactic features used by a machine learning model to explore what should be the efficient way in allowing this knowledge to provide the highest positive effect on the NER. The empirical results show that the best feature set consists of predicate’s surface form, predicate’s lemma, voice, and the united feature of subject-object head’s lemma and transitive-intransitive sense. The performance improvement from using these features indicates the advantage of the predicate-argument semantic knowledge on NER. There are still rooms to enhance NER by using this semantic knowledge (e.g. to employ other semantic roles besides agent and theme and to extend the rules for efficient identification of an argument’s boundary).

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

متن کامل

Exploring Semantic Roles for Named Entity Recognition in the Molecular Biology Domain

.................................................................................................................................. 3 Acknowledgements ................................................................................................................. 5

متن کامل

سیستم شناسایی و طبقه‌بندی موجودیت‌های اسمی در متون زبان فارسی بر پایه شبکه عصبی

Named Entity Recognition (NER) is a fundamental task in natural language processing and also known as a subset of information extraction. We seek to locate and classify named entities in text into predefined categories such as the names of persons, organizations, locations, expressions of times, etc. Named Entity Recognition for English texts has been researched widely for the past years, howev...

متن کامل

An Eye-tracking Study of Named Entity Annotation

Utilising effective features in machine learning-based natural language processing (NLP) is crucial in achieving good performance for a given NLP task. The paper describes a pilot study on the analysis of eye-tracking data during named entity (NE) annotation, aiming at obtaining insights into effective features for the NE recognition task. The eye gaze data were collected from 10 annotators and...

متن کامل

Annotation of Predicate-argument Structure on Molecular Biology Text

Annotated corpora are essential resources for natural language processing. This paper describes our approach for building a corpus annotated with predicateargument structure on research abstracts in molecular biology domain. Observation of the records in a database of cell signaling events and corresponding research abstracts showed that extracting predicateargument structure is a useful interm...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005